68 research outputs found

    Cross-Domain Labeled LDA for Cross-Domain Text Classification

    Cross-domain text classification aims to build a classifier for a target domain by leveraging data from both the source and target domains. One promising idea is to minimize the differences between the feature distributions of the two domains. Most existing studies minimize such differences explicitly through an exact alignment mechanism (e.g., one-to-one feature alignment or a projection matrix). Such exact alignment, however, restricts the model's learning ability and further impairs classification performance when the semantic distributions of the domains differ substantially. To address this problem, we propose a novel group alignment that aligns semantics at the group level. In addition, to help the model learn better semantic groups and the semantics within them, we also propose a partial supervision for the model's learning in the source domain. To this end, we embed the group alignment and the partial supervision into a cross-domain topic model, yielding Cross-Domain Labeled LDA (CDL-LDA). On the standard 20Newsgroups and Reuters datasets, extensive quantitative (classification, perplexity, etc.) and qualitative (topic detection) experiments demonstrate the effectiveness of the proposed group alignment and partial supervision. Comment: ICDM 201
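    The abstract does not give the alignment objective itself. As a rough illustration only (not CDL-LDA's actual formulation), group-level alignment can be contrasted with one-to-one alignment by matching the averaged topic distribution of each semantic group across domains rather than matching every topic individually; the group structure, the symmetric-KL distance, and the array shapes below are assumptions made for the sketch.

    ```python
    # Illustrative sketch only: contrasts one-to-one alignment of topic
    # distributions with group-level alignment. Distance choice (symmetric KL),
    # group structure, and shapes are assumptions, not CDL-LDA's objective.
    import numpy as np

    def sym_kl(p, q, eps=1e-12):
        """Symmetric KL divergence between two probability vectors."""
        p, q = p + eps, q + eps
        return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

    def exact_alignment_penalty(src_topics, tgt_topics):
        """One-to-one alignment: every source topic must match its target counterpart."""
        return sum(sym_kl(s, t) for s, t in zip(src_topics, tgt_topics))

    def group_alignment_penalty(src_topics, tgt_topics, groups):
        """Group alignment: only the averaged topic distribution of each
        semantic group is required to match across domains."""
        penalty = 0.0
        for idx in groups:  # idx: list of topic indices forming one semantic group
            penalty += sym_kl(src_topics[idx].mean(axis=0),
                              tgt_topics[idx].mean(axis=0))
        return penalty

    # Toy usage: 4 topics over a 6-word vocabulary, two semantic groups.
    rng = np.random.default_rng(0)
    src = rng.dirichlet(np.ones(6), size=4)
    tgt = rng.dirichlet(np.ones(6), size=4)
    groups = [[0, 1], [2, 3]]
    print(exact_alignment_penalty(src, tgt), group_alignment_penalty(src, tgt, groups))
    ```

    The group penalty is strictly looser than the exact one: individual topics may drift across domains as long as each group's aggregate semantics stays aligned, which is the flexibility the abstract argues for.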

    Exploring the Confounding Factors of Academic Career Success: An Empirical Study with Deep Predictive Modeling

    Understanding the determinants of success in academic careers is critically important to both scholars and their employing organizations. While considerable research effort has gone in this direction, a quantitative approach to modeling scholars' academic careers is still lacking because of the many confounding factors involved. To this end, in this paper we explore the determinants of academic career success from an empirical and predictive-modeling perspective, focusing on two typical academic honors, IEEE Fellow and ACM Fellow. We quantitatively analyze the importance of different factors and obtain several insightful findings. Specifically, analyzing the co-author network, we find that prospective Fellows collaborate closely with influential scholars early on and even more closely as their careers progress. We then compare the academic performance of male and female Fellows and find that, to be elected, female scholars need to put in more effort than male scholars. In addition, we find that being elected a Fellow does not bring improvements in citation or productivity growth. We hope these derived factors and findings can help scholars improve their competitiveness and develop well in their academic careers.
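    The abstract does not say how "working closely with influential scholars" is measured. One plausible proxy (an assumption for illustration, not the paper's metric) is the fraction of a scholar's co-authorship weight that involves already-influential scholars, computed on the co-author graph:

    ```python
    # Illustrative sketch only: a simple proxy for collaboration closeness with
    # influential scholars in a co-author network. The metric, the graph
    # representation, and the toy data are assumptions, not the paper's method.
    import networkx as nx

    def influential_collaboration_ratio(G, scholar, influential):
        """Fraction of a scholar's co-authorship weight shared with influential scholars."""
        total = sum(d.get("weight", 1) for _, _, d in G.edges(scholar, data=True))
        if total == 0:
            return 0.0
        with_influential = sum(d.get("weight", 1)
                               for _, v, d in G.edges(scholar, data=True)
                               if v in influential)
        return with_influential / total

    # Toy co-author graph: edge weight = number of joint papers.
    G = nx.Graph()
    G.add_weighted_edges_from([("alice", "fellow_1", 5),
                               ("alice", "bob", 2),
                               ("bob", "fellow_2", 1)])
    print(influential_collaboration_ratio(G, "alice", {"fellow_1", "fellow_2"}))  # 5/7
    ```

    Tracking such a ratio per career year would give one way to observe the "early and increasingly close" collaboration pattern the abstract reports.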

    DPR: An Algorithm to Mitigate Bias Accumulation in Recommendation Feedback Loops

    Recommendation models trained on user feedback collected from deployed recommendation systems are commonly biased. User feedback is strongly affected by the exposure mechanism: users only provide feedback on the items exposed to them and passively ignore unexposed items, which produces numerous false negative samples. Inevitably, the biases caused by such feedback are inherited by new models and amplified through feedback loops. Moreover, the presence of false negative samples makes negative sampling difficult and injects spurious information into the model's user-preference modeling. Recent work has investigated the negative impact of feedback loops and unknown exposure mechanisms on recommendation quality and user experience, but essentially treats them as independent factors and ignores their cross-effects. To address these issues, we analyze the data exposure mechanism in depth from the perspective of data iteration and feedback loops under the Missing Not At Random (MNAR) assumption, and theoretically demonstrate the existence of a stabilization factor in the transformation of the exposure mechanism under feedback loops. We further propose Dynamic Personalized Ranking (DPR), an unbiased algorithm that uses dynamic re-weighting to mitigate the cross-effects of exposure mechanisms and feedback loops without requiring additional information. Furthermore, we design a plugin named Universal Anti-False Negative (UFN) to mitigate the negative impact of the false negative problem. We demonstrate theoretically that our approach mitigates the negative effects of feedback loops and unknown exposure mechanisms. Experimental results on real-world datasets show that models using DPR better handle bias accumulation, and confirm the universality of UFN across mainstream loss methods.
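    The abstract does not specify the re-weighting scheme. As a generic illustration only (not DPR's or UFN's actual formulation), exposure-aware re-weighting of a pairwise BPR-style loss can down-weight over-exposed positives and suspected false negatives; the weight definitions below are assumptions made for the sketch.

    ```python
    # Illustrative sketch only: inverse-propensity-style re-weighting of a
    # pairwise (BPR-like) loss, plus a down-weight for likely false negatives.
    # The weighting scheme is assumed for illustration, not DPR/UFN's algorithm.
    import numpy as np

    def weighted_pairwise_loss(pos_scores, neg_scores, pos_exposure, neg_confidence):
        """pos_scores / neg_scores: model scores for positive and sampled negative items.
        pos_exposure: estimated exposure probability of each positive item
        (down-weights over-exposed items). neg_confidence: probability that each
        sampled negative is a true negative (down-weights suspected false negatives)."""
        w = neg_confidence / np.clip(pos_exposure, 1e-6, None)
        pairwise = np.log1p(np.exp(-(pos_scores - neg_scores)))  # -log sigmoid(pos - neg)
        return float(np.mean(w * pairwise))

    # Toy usage with random scores and heuristic weights.
    rng = np.random.default_rng(1)
    pos, neg = rng.normal(0.5, 1, 8), rng.normal(-0.5, 1, 8)
    exposure = rng.uniform(0.2, 1.0, 8)    # e.g. estimated from item popularity
    confidence = rng.uniform(0.5, 1.0, 8)  # e.g. from how long an item stayed unclicked
    print(weighted_pairwise_loss(pos, neg, exposure, confidence))
    ```

    In a feedback-loop setting these weights would be recomputed at each data iteration as the exposure distribution shifts, which is the "dynamic" aspect the abstract emphasizes.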
    • …